ESQL: Type conflict resolution in unmapped-fields load#143693
ESQL: Type conflict resolution in unmapped-fields load#143693GalLalouche merged 165 commits intoelastic:mainfrom
Conversation
PotentiallyUnmappedKeywordEsField (used for fields loaded from _source when unmapped_fields="load") had isAggregatable=true, which caused isPushableFieldAttribute to short-circuit past the SearchStats check and push filters down to Lucene. On shards where the field is not indexed, the Lucene query returns no results instead of letting the compute engine evaluate the filter on _source-loaded values. Guard isPushableFieldAttribute against PotentiallyUnmappedKeywordEsField so these fields are always filtered in the compute engine. Co-authored-by: Cursor <cursoragent@cursor.com>
The isPushableFieldAttribute fix (rejecting PotentiallyUnmappedKeywordEsField) already covers sort pushdown since PushTopNToSource uses the same method. This test verifies correct sort order when a field is mapped in one index but unmapped in another under unmapped_fields="load". Co-authored-by: Cursor <cursoragent@cursor.com>
x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec
Show resolved
Hide resolved
x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Analyzer.java
Outdated
Show resolved
Hide resolved
| ; | ||
|
|
||
| partiallyUnmappedNonKeyword | ||
| required_capability: optional_fields_v2 |
| required_capability: index_metadata_field | ||
|
|
||
| SET unmapped_fields="load"\; | ||
| FROM sample_data, no_mapping_sample_data METADATA _index |
| ; | ||
|
|
||
| ####################### | ||
| # Type conflict tests # |
| # Type conflict tests # | ||
| ####################### | ||
|
|
||
| typeConflictKeywordUnmappedCastToKeyword |
| var convert = new ToDateNanos(f.source(), f, context.configuration()); | ||
| imf.types().forEach(type -> typeResolutions(f, convert, type, imf, typeResolutions)); | ||
| var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions); | ||
| var resolvedField = ResolveUnionTypes.resolvedMultiTypeEsField(f, typeResolutions, null); |
There was a problem hiding this comment.
So while analyzing this, I uncovered an underlying pre-existing union type bug! #144870.
It looks like this is actually fine to be null (I added an assertion), but the other one isn't.
| indexToConversionExpressions, | ||
| fa.field().getTimeSeriesFieldType() | ||
| fa.field().getTimeSeriesFieldType(), | ||
| null // FIXME(gal, NOCOMMIT) This basically a copy constructor, shouldn't be null; write a test to flush it out. |
There was a problem hiding this comment.
This is actually fine, due to the behavior mentioned in #144870. I've added an assertion and comment instead.
| @@ -2633,7 +2653,7 @@ public LogicalPlan apply(LogicalPlan plan, AnalyzerContext context) { | |||
There was a problem hiding this comment.
This part though, is a bug! If some fields are unmapped and load is on, then not all types are actually dates! Fixed and added tests.
x-pack/plugin/esql/qa/testFixtures/src/main/resources/unmapped-load.csv-spec
Show resolved
Hide resolved
There was a problem hiding this comment.
Done, except TS. Now it's a race to see who finishes first :)
| indexToConversionExpressions, | ||
| fa.field().getTimeSeriesFieldType() | ||
| fa.field().getTimeSeriesFieldType(), | ||
| null |
There was a problem hiding this comment.
Added an assert and a comment why this should be null.
| if (indexMappingHashToDuplicateCount.compute(response.getIndexMappingHash(), (k, v) -> v == null ? 1 : v + 1) > 1) { | ||
| continue; | ||
| } | ||
| boolean isNew = indexMappingHashToDuplicateCount.compute( |
There was a problem hiding this comment.
This was a bug found by existing INSIST_🐔 tests! (The one I stupidly deleted because I didn't want to start fixing INSIST code 🤦.) If two indices share the same mapping, we still need to update fieldToMappedIndices. I've transferred the tests to use unmapped_fields=load instead.
This PR adds support for resolving type conflicts with unmapped fields. Similar to how [union types](https://www.elastic.co/docs/reference/query-languages/esql/esql-multi-index#esql-multi-index-union-types) work, this is resolved using explicit casts, except we have an additional constraint that potentially unmapped fields are treated as `KEYWORD`. In particular, if a field is Partially Unmapped and when it is mapped it's a Non-Keyword (or PUNK, as per @mouhc1ne), we have a type conflict, which may itself be conflicted if the field is mapped to different fields (in addition to the unmapped `KEYWORD`). Resolves elastic#141912. Resolves elastic#142004.
This PR adds support for resolving type conflicts with unmapped fields. Similar to how union types work, this is resolved using explicit casts, except we have an additional constraint that potentially unmapped fields are treated as
KEYWORD.In particular, if a field is Partially Unmapped and when it is mapped it's a Non-Keyword (or PUNK, as per @mouhc1ne), we have a type conflict, which may itself be conflicted if the field is mapped to different fields (in addition to the unmapped
KEYWORD).Resolves #141912.
Resolves #142004.